Combining Efficient XML Compression with Query Processing

نویسندگان

  • Przemyslaw Skibinski
  • Jakub Swacha
چکیده

This paper describes a new XML compression scheme that offers both high compression ratios and short query response time. Its core is a fully reversible transform featuring substitution of every word in an XML document using a semi-dynamic dictionary, effective encoding of dictionary indices, as well as numbers, dates and times found in the document, and grouping data within the same structural context in individual containers. The results of conducted tests show that the proposed scheme attains compression ratios rivaling the best available algorithms, and fast compression, decompression, and query processing.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Compressing and Filtering XML Streams

Information technology is widely adopting the use of XML for information exchange. As messaging standards migrate to XML, there is growing concern for the magnitude of messages compared to binary formatted messages. XML compression can help mitigate the risk of exceeding the capacity of current communication resources. However, it is critical that compression technologies do not hinder XML quer...

متن کامل

MQX: Multi-Query Processing Engine for Compressed XML Data

In this demonstration, we present an XML query engine MQX, which is developed for processing multiple subscribed XPath queries over compressed XML documents. MQX is equipped with efficient mechanisms for query rewriting, query organization and query optimization. To our knowledge, this is the first prototype to address the problem of efficiently processing multiple XML queries in a co-operative...

متن کامل

SpiderX: Fast XML Exploration System

Keyword search in XML has gained popularity as it enables users to easily access XML data without the need of learning query languages and studying complex data schemas. In XML keyword search, query semantics is based on the concept of Lowest Common Ancestor (LCA), e.g., SLCA and ELCA. However, LCA-based search methods depend heavily on hierarchical structures of XML data, which may result in m...

متن کامل

Indexing and Querying Semistructured Data Views of Relational Database

The most promising and dominant data format for data processing and representing on the Internet is the Semistructured data form termed XML. XML data has no fixed schema; it evolved and is self describing which results in management difficulties compared to, for example relational data. XML queries differ from relational queries in that the former are expressed as path expressions. The efficien...

متن کامل

BSBC: Towards a Succinct Data Format for XML Streams

XML data compression is an important feature in XML data exchange, particularly when the data size may cause bottlenecks or when bandwidth and energy consumption limitations require reducing the amount of the exchanged XML data. However, applications based on XML data streams also require efficient path query processing on the structure of compressed XML data streams. We present a succinct repr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007